Search results for "Latent semantic analysis"
showing 10 items of 40 documents
Mix and Match Features: Relevance Feedback and Combined Similarity Metrics
2001
Feature Dimensionality Reduction for Mammographic Report Classification
2016
The amount and variety of available medical data coming from multiple, heterogeneous sources can inhibit analysis, manual interpretation, and the use of simple data management applications. In this paper an in-depth overview of the principal algorithms for dimensionality reduction is carried out; moreover, the most effective techniques are applied to a dataset composed of 4461 mammographic reports. The most useful medical terms are converted and represented using a TF-IDF matrix, in order to enable data mining and retrieval tasks. A series of queries has been performed on the raw matrix and on the same matrix after the dimensionality reduction obtained using the most useful techni…
Geometric Algebra Rotors for Sub-symbolic Coding of Natural Language Sentences
2007
A sub-symbolic encoding methodology for natural language sentences is presented. The procedure is based on the creation of an LSA-inspired semantic space and associates rotation operators derived from Geometric Algebra with the word bigrams of the sentence. The operators are subsequently applied to an orthonormal standard basis of the created semantic space according to the order in which words appear in the sentence. The final rotated basis is then coded as a vector and its orthogonal part constitutes the sub-symbolic coding of the sentence. Preliminary experimental results for a classification task, compared with the traditional LSA methodology, show the effectiveness of the approach.
Sub-symbolic Encoding of Words
2003
A new methodology for sub-symbolic semantic encoding of words is presented. The methodology uses the WordNet lexical database and an ad hoc modified Sammon algorithm to associate a vector with each word in a semantic n-space. All words have been grouped according to the WordNet lexicographers’ files classification criteria: these groups have been called lexical sets. The word vector is composed of two parts: the first takes into account the word's membership in one of these lexical sets; the second is related to the meaning of the word and is responsible for distinguishing it from the other words in the same lexical set. The application of the proposed technique over all…
Semantic Computing of Moods Based on Tags in Social Media of Music
2014
Social tags inherent in online music services such as Last.fm provide a rich source of information on musical moods. The abundance of social tags makes this data highly beneficial for developing techniques to manage and retrieve mood information, and enables study of the relationships between music content and mood representations with data substantially larger than that available for conventional emotion research. However, no systematic assessment has been done on the accuracy of social tags and derived semantic models at capturing mood information in music. We propose a novel technique called Affective Circumplex Transformation (ACT) for representing the moods of music tracks in an interp…
A KST-Based System for Student Tutoring
2008
A novel assessment procedure based on knowledge space theory (KST) is presented along with a complete implementation of an intelligent tutoring system (ITS) that has been used to test our theoretical findings. The key idea is that correct assessment of the student's knowledge is strictly related to the structure of the domain ontology. Suitable relationships between the concepts must be present to allow the creation of a reverse path from the "knowledge state" representing the student's goal to the one that contains her actual knowledge about this topic. Knowledge space theory is a very good framework to guide the process of building the ontology used by the artificial tutor. The sys…
A Geometric Approach to Automatic Description of Iconic Scenes
2005
A step towards the automatic description of scenes with a geometric approach is proposed. The scenes considered are composed of a set of elements that can be geometric forms or iconic representations of objects. Every icon is characterized by a set of attributes such as shape, colour, position, and orientation. Each scene is related to a set of sentences describing its content. The proposed approach builds a data-driven vector semantic space into which the scenes and the sentences are mapped. Sentences and scenes with the same meaning are mapped to nearby vectors, and distance criteria allow semantic relations to be retrieved.
Convergence of Web 2.0 and Semantic Web: A Semantic Tagging and Searching System for Creating and Searching Blogs
2007
The work presented in this paper aims to combine Latent Semantic Analysis methodology, common sense and traditional knowledge representation in order to improve the dialogue capabilities of a conversational agent. In our approach the agent brain is characterized by two areas: a "rational area", composed of a structured, rule-based knowledge base, and an "associative area", obtained through a data-driven semantic space. Concepts are mapped into this space and their mutual geometric distance is related to their conceptual similarity. The geometric distance between concepts implicitly defines a sub-symbolic relationship net, which can be seen as a new "sub-symbolic semantic layer" automaticall…
Predicting Word Maturity from Frequency and Semantic Diversity: A Computational Study
2016
Semantic word representation changes over different ages of childhood until it reaches its adult form. One method to formally model this change is the word maturity paradigm. This method uses a text sample for each age, including adult age, and transforms the samples into a semantic space by means of Latent Semantic Analysis. The representation of a word at every age is then compared with its adult representation via computational maturity indices. The present study used this paradigm to explore the impact of word frequency and semantic diversity on maturity indices. To do this, word maturity indices were extracted from a Spanish incremental corpus and validated, using correlation scor…
Semantic Word Error Rate for Sentence Similarity
2016
Sentence similarity measures have applications in several tasks, including Machine Translation, Paraphrase Identification, Speech Recognition, Question-answering and Text Summarization. However, measures designed for these tasks are aimed at assessing equivalence rather than resemblance, partly departing from human cognition of similarity. While this is reasonable for these activities, it hinders the applicability of sentence similarity measures to other tasks. We therefore propose a new sentence similarity measure specifically designed for resemblance evaluation, in order to cover these fields better. Experimental results are discussed.